Display VTT transcripts in audio/video players#7418

Open
eltiffster wants to merge 43 commits into main from av-transcripts

@eltiffster eltiffster commented Apr 17, 2026

Summary For Release Notes

Allow users to select and display WebVTT files as transcripts/captions for audio/video files.

If using Flexible Metadata, please add the following to your profile:

properties:
  transcript_ids:
    available_on:
      class:
        - Hyrax::FileSet
    indexing:
      - transcript_ids_ssim
    data_type: array
    display_label: Transcripts
    form:
      primary: false
    property_uri: http://vocabulary.samvera.org/ns#transcriptIds

Guidance Setup:

  • Create a work with an audio/video representative file set and a VTT file set.
  • Once both files have been uploaded, go to the edit page for the audio/video file set. There should be a form field titled "Transcripts" that lists your VTT file as an option. Select your VTT file and click Save.
  • If IIIF/AV Support is not enabled (via the Features Dashboard):
    • Go to a file set or work page
    • Click the 3 dots in the audio/video player and click "Captions" to enable captions.
  • If IIIF/AV Support is enabled, you'll need to first install the Clover IIIF viewer since UV doesn't support captions at time of writing:
    • In your webapp (.dassie or /app/samvera/hyrax-webapp in Docker), run rails generate hyrax:iiif_viewer clover to install
    • NOTE: If you previously installed Clover, you'll need to run rails destroy hyrax:iiif_viewer clover, then rails generate hyrax:iiif_viewer clover to reinstall fresh copies of the viewer files. The clover.js file was altered to fix this issue and an extra class was added to a div in clover.html.erb.
    • Once Clover is installed and the transcript has been saved to the audio/video file, go to the work page. You should be able to display the captions by clicking the 3 dots in the audio/video player toolbar and enabling them. If you click on the "Annotations" tab, you should also see the interactive transcript with timestamps (example pictured below).
screenshot of the interactive transcript in Clover's Annotations tab, side-by-side with the video player

Type of change (for release notes)

  • notes-major (I think) due to needing to install a new gem for converting language field values into a language code readable by the IIIF viewer

Detailed Description

This is a continuation of work started before/during the March 2026 Community Sprint. More context/discussion on implementation was recorded on the Sprint Board.

After uploading VTT file(s) to a work with an audio/video (AV) file, users can choose to use the VTT file as the subtitles/captions file for the corresponding audio/video. This is done by editing the AV file set, selecting the VTT file by title, and saving the AV file set (see screenshot below). Under the hood, the VTT file set ids are saved to the transcript_ids attribute of the AV file and indexed as transcript_ids_ssim in the AV file's Solr document.
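
As a rough illustration of the indexing step (not the code in this PR), the sketch below maps the transcript ids of an AV file set onto the transcript_ids_ssim Solr field. The `AvFileSet` struct and the `transcript_solr_fields` method are hypothetical stand-ins; only the attribute and field names come from the description above.

```ruby
# Hypothetical stand-in for an AV file set; the real classes in Hyrax are
# the ActiveFedora and Valkyrie file set models.
AvFileSet = Struct.new(:id, :transcript_ids, keyword_init: true)

# Builds the Solr fragment for the selected transcripts.
# Blank ids are dropped so an empty form submission indexes an empty array.
def transcript_solr_fields(file_set)
  ids = Array(file_set.transcript_ids).map(&:to_s).reject(&:empty?)
  { 'transcript_ids_ssim' => ids }
end

av = AvFileSet.new(id: 'av-1', transcript_ids: ['vtt-1', 'vtt-2'])
puts transcript_solr_fields(av).inspect
```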

The transcript(s) form is populated by a Solr query that searches for "sibling" file sets (i.e. file sets of the same parent work as the AV file set) with a text/vtt mime type. Currently, text/vtt is the only accepted mime type, but other mime types could be added in future. Users can select multiple VTT files per audio/video file. This approach also supports a nested work structure where a child work has a different transcript than other child works or its parent work.
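
To make the "sibling" selection concrete, here is a minimal in-memory sketch, assuming the parent work's file set documents have already been fetched as plain hashes. It is not the actual Solr query; the `mime_type_ssi` and `title_tesim` keys and the `transcript_candidates` method are assumptions for illustration.

```ruby
# Only text/vtt is accepted for now; other mime types could be appended here.
ACCEPTED_TRANSCRIPT_MIME_TYPES = ['text/vtt'].freeze

# Returns the sibling documents that can be offered as transcripts for the
# AV file set: same parent work, accepted mime type, and not the AV file itself.
def transcript_candidates(sibling_docs, av_file_set_id)
  sibling_docs.select do |doc|
    doc['id'] != av_file_set_id &&
      ACCEPTED_TRANSCRIPT_MIME_TYPES.include?(doc['mime_type_ssi'])
  end
end

siblings = [
  { 'id' => 'av-1',  'mime_type_ssi' => 'video/mp4', 'title_tesim' => ['Interview'] },
  { 'id' => 'vtt-1', 'mime_type_ssi' => 'text/vtt',  'title_tesim' => ['English captions'] },
  { 'id' => 'img-1', 'mime_type_ssi' => 'image/png', 'title_tesim' => ['Poster'] }
]

# Title/id pairs in the shape a select field would typically consume.
options = transcript_candidates(siblings, 'av-1').map { |doc| [doc['title_tesim'].first, doc['id']] }
puts options.inspect
```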

captions_edit_interface

Selected transcripts are displayed via a <track> element in the default audio/video partials. When using a IIIF AV viewer, the transcript is included in the IIIF manifest as an annotation, following the pattern of this IIIF cookbook recipe:

"annotations": [
  {
    "type": "AnnotationPage",
    "items": [
      {
        "type": "Annotation",
        "motivation": "supplementing",
        "body": {
          "id": "http://localhost:3000/transcripts/file_id.vtt",
          "type": "Text",
          "format": "text/vtt",
          "label": {
            "none": ["title_or_label.vtt"]
          }
        },
        "id": "http://localhost:3000/concern/generic_works/id/manifest/canvas/id",
        "target": "http://localhost:3000/concern/generic_works/id/manifest/canvas/id"
      }
    ]
  }
]
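
For readers who prefer code to JSON, the structure above could be assembled roughly as follows. This is a sketch using only Ruby's standard library, not the presenter code in this PR; the method name `transcript_annotation_page` and its parameters are hypothetical.

```ruby
require 'json'

# Builds one AnnotationPage holding a single "supplementing" annotation
# that points a canvas at a WebVTT transcript.
def transcript_annotation_page(annotation_id:, canvas_id:, transcript_url:, label:)
  {
    'type' => 'AnnotationPage',
    'items' => [
      {
        'type' => 'Annotation',
        'motivation' => 'supplementing',
        'body' => {
          'id' => transcript_url,
          'type' => 'Text',
          'format' => 'text/vtt',
          'label' => { 'none' => [label] }
        },
        'id' => annotation_id,
        'target' => canvas_id
      }
    ]
  }
end

canvas = 'http://localhost:3000/concern/generic_works/id/manifest/canvas/id'
page = transcript_annotation_page(
  annotation_id: canvas, # the example above reuses the canvas URL as the annotation id
  canvas_id: canvas,
  transcript_url: 'http://localhost:3000/transcripts/file_id.vtt',
  label: 'title_or_label.vtt'
)
puts JSON.pretty_generate('annotations' => [page])
```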

Thanks to @kirkkwang and @trmccormick for their work on this!

Changes proposed in this pull request:

  • Add transcript_ids and language properties/attributes to ActiveFedora and Hyrax file set classes, their respective form classes, and their respective indexers
  • Add UI hints/guidance for transcript_ids and language form fields in the file set edit form
  • Pass a Hyrax::FileSetPresenter to render_media_display_partial instead of a Solr document (in app/views/hyrax/file_sets/edit.html.erb)
  • Add VTT transcripts as annotations to the IIIF manifest and sort them by language/locale by @kirkkwang
  • Add homepage property to IIIF manifest by @kirkkwang
  • Add route and controller for serving transcripts to IIIF AV viewers by @kirkkwang
  • Add LanguageList gem for parsing 2-letter language code from a Solr document's language field
  • Add <track> elements to app/views/hyrax/file_sets/media_display/_video.html.erb and app/views/hyrax/file_sets/media_display/_audio.html.erb: initial idea and technical plan by @trmccormick
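
The sketch below shows, under stated assumptions, how a <track> element might be rendered for one transcript. It is not the partial code from this PR: the `track_tag` and `language_code` methods are hypothetical, and the small `LANGUAGE_CODES` table is only a stand-in for the lookup that the LanguageList gem performs.

```ruby
require 'erb'

# Tiny stand-in for a language-name-to-code lookup (assumption for this sketch).
LANGUAGE_CODES = { 'english' => 'en', 'french' => 'fr', 'spanish' => 'es' }.freeze

# Returns the 2-letter code for a language name, or nil when it is unknown.
def language_code(name)
  LANGUAGE_CODES[name.to_s.strip.downcase]
end

# Renders an HTML <track> element; srclang is omitted when no code is found.
def track_tag(src:, label:, language: nil, kind: 'captions')
  attrs = { 'kind' => kind, 'src' => src, 'label' => label }
  code = language_code(language)
  attrs['srclang'] = code if code
  rendered = attrs.map { |name, value| %(#{name}="#{ERB::Util.html_escape(value)}") }.join(' ')
  "<track #{rendered}>"
end

puts track_tag(src: '/transcripts/file_id.vtt', label: 'title_or_label.vtt', language: 'English')
```

Attribute values are escaped because the label comes from a user-supplied title.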

Possible Future Work

  • Localization/translation for UI hints for the language and transcript fields
  • The file set edit form simply renders the views/hyrax/file_sets/_form.html.erb partial for ActiveFedora file sets, while the Valkyrie version uses the hydra-editor gem, which renders partials in views/records/edit_fields. Is there a reason for this difference? Should the form be refactored?
  • Ideally, Clover should be easier to configure, maybe with a JSON object/file like with Universal Viewer. This would require using a JS framework to modify it and rebuild the clover.js file.

@samvera/hyrax-code-reviewers

…notations tab of Clover IIIF viewer.

"l" is supposed to refer to the <ItemStyled> object in Annotation (https://github.com/samvera-labs/clover-iiif/blob/main/src/components/Viewer/InformationPanel/Annotation/Item.styled.tsx). In the minified version of this file, this should actually be uppercase L, not lowercase l.

`l("span",{style:{backgroundImage` was changed to `L("span",{style:{backgroundImage`
`return l(n9,{dir:P,"data-format"` was changed to `return L(n9,{dir:P,"data-format"`

There appears to be an unnecessary call to `l()` in the following switch/case statement:
```
case "text/vtt":
  return l(HQ, {
    inlineCues: k,
    label: A,
    vttUri: ((D = y[0]) == null ? void 0 : D.id) || void 0
  });
```

In fact, `return l(HQ,{inlineCues:k` can be changed to `return HQ({inlineCues:k` since HQ is a function in the minified file.
…ion for other file types seems unfinished and doesn't work.
…f the solr document. Do not make language a required field for a file set.
… Remove tests for code that was already tested/covered elsewhere.

github-actions Bot commented Apr 17, 2026

Test Results

    17 files  ±    0       1 errors   16 suites   - 1   3h 21m 57s ⏱️ - 7m 57s
 7 362 tests +   26   7 056 ✅ +   26  306 💤 ± 0  0 ❌ ±0 
23 314 runs   - 1 372  22 748 ✅  - 1 347  566 💤  - 25  0 ❌ ±0 

For more details on these parsing errors, see this check.

Results for commit 97da895. ± Comparison against base commit cae6920.

This pull request removes 447 and adds 473 tests. Note that renamed tests count towards both.
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007efd87f90160>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f43517afc88>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f940c3e3d50>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007fb14017d498>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007efd880cb2c8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f43517bc1e0>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f940c403218>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007fb14085e258>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy AdminSet: 03bcd766-44a7-4c85-a114-cc0a95c5566b
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy Hyrax::AdministrativeSet: 7913a7b4-338a-4103-bd3e-5d961108ea33
…
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f294ddba510>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f4c1dcb0b40>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007fa794f8f780>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007fec576d5010>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f294ddf4120>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f4c1dcee620>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007fa794f9be18>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007fec57cbcbe0>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy AdminSet: 48f52c14-0ddc-4a03-bd5a-45ec15465319
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy Hyrax::AdministrativeSet: 2696814f-a7dd-4f50-b7b1-a30c0d9f469d
…

♻️ This comment has been updated with latest results.

@eltiffster eltiffster changed the title [DRAFT] Display VTT transcripts in audio/video players Display VTT transcripts in audio/video players Apr 22, 2026
@eltiffster eltiffster marked this pull request as ready for review April 22, 2026 20:32
kirkkwang previously approved these changes Apr 23, 2026

@kirkkwang kirkkwang left a comment

This looks good to me. I think the only thing is to maybe take out the ActiveTriples::Resource case; it doesn't seem likely as a case at the moment anyway.

@orangewolf orangewolf left a comment

This is really great work. I have a few questions and concerns, but the feature as a whole looks great. Let me know if you want to pair up on the m3 part or the file streaming.

Comment thread app/controllers/hyrax/transcripts_controller.rb Outdated
Comment thread app/models/concerns/hyrax/file_set/transcripts.rb Outdated
Comment thread app/presenters/hyrax/iiif_manifest_presenter.rb
Comment thread app/services/hyrax/file_set_type_service.rb
Comment thread app/views/hyrax/file_sets/media_display/_video.html.erb
…transcripts.

It was created for a hypothetical edge case in which someone might configure a controlled vocabulary field (e.g. Hyrax::ControlledVocabularies::Language) in an AF-based app. Removed since it doesn't refer to any existing use case in Hyrax.

eltiffster commented Apr 27, 2026

This is really great work. I have a few questions and concerns, but the feature as a whole looks great. Let me know if you want to pair up on the m3 part or the file streaming.

Thanks, @orangewolf! I'm not very familiar with the flexible metadata features, so I'd appreciate any help I can get.

I tried adding transcript_ids to config/metadata_profiles/m3_profile.yml like so:

properties:
  transcript_ids:
    available_on:
      class:
        - Hyrax::FileSet
    indexing:
      - transcript_ids_ssim
    data_type: array
    display_label: Transcripts
    form:
      primary: false
    property_uri: http://vocabulary.samvera.org/ns#transcriptIds

I also added it to both .dassie and .koppie in the same place, but I don't see the transcript_ids field if I download the default profile through the interface. What am I doing wrong?

So I had to delete the first default profile from the database, which allowed Hyrax::FlexibleSchema.create_default_schema to pick up the new field.

How should this change be handled for those who have already created a flexible schema?

Should there be documentation advising people to add transcript_ids to their flexible metadata profile? Or a button that compares the default properties to the current profile's properties and creates a new schema with the added properties?
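
To sketch what such a comparison could look like (purely illustrative, not an existing Hyrax feature), the snippet below diffs the property names of two profile YAML documents using Ruby's standard library. The `missing_properties` method and the sample profiles are hypothetical.

```ruby
require 'yaml'

# Returns the property names present in the default profile but absent
# from the current profile. Both arguments are YAML strings.
def missing_properties(default_yaml, current_yaml)
  default_props = (YAML.safe_load(default_yaml) || {})['properties'] || {}
  current_props = (YAML.safe_load(current_yaml) || {})['properties'] || {}
  default_props.keys - current_props.keys
end

default_profile = <<~PROFILE
  properties:
    title:
      data_type: array
    transcript_ids:
      data_type: array
      indexing:
        - transcript_ids_ssim
PROFILE

current_profile = <<~PROFILE
  properties:
    title:
      data_type: array
PROFILE

puts missing_properties(default_profile, current_profile).inspect
```

A check like this would only report what is missing; deciding whether to create a new schema version with the added properties would still be up to the administrator.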

@eltiffster eltiffster requested a review from orangewolf April 29, 2026 23:43
@orangewolf

So I had to delete the first default profile from the database, which allowed Hyrax::FlexibleSchema.create_default_schema to pick up the new field.

How should this change be handled for those who have already created a flexible schema?

Should there be documentation advising people to add transcript_ids to their flexible metadata profile? Or a button that compares the default properties to the current profile's properties and creates a new schema with the added properties?

I updated the description so we can catch it in the release notes.

Development

Successfully merging this pull request may close these issues.

Annotations tab fails in Vanilla JS

3 participants